[SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default#35113
[SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default#35113imback82 wants to merge 6 commits intoapache:masterfrom
Conversation
| if isSessionCatalog(catalog) => | ||
| if (name.length != 1) { | ||
| throw QueryCompilationErrors.invalidDatabaseNameError(name.quoted) | ||
| } |
There was a problem hiding this comment.
This check is handled in DatabaseNameInSessionCatalog.
| with command.TestsV1AndV2Commands { | ||
| override def namespace: String = "db" | ||
| override def notFoundMsgPrefix: String = "Database" | ||
| override def notFoundMsgPrefix: String = if (runningV1Command) "Database" else "Namespace" |
There was a problem hiding this comment.
We need to update the test now that v1 catalogs are run against both v1 and v2 commands (after adding if conf.useV1Command) .
There was a problem hiding this comment.
shall we move this to the base class as well?
|
cc @cloud-fan |
| super.test(testName, testTags: _*) { | ||
| // Need to set command version inside this test function so that | ||
| // the correct command version is available in each test. | ||
| setCommandVersion() |
There was a problem hiding this comment.
sorry I'm confused. We set the version right before super.test(....), do you mean it's being set again during super.test(....)?
There was a problem hiding this comment.
This def test doesn't run testFun when it's invoked; it only registers the test). So by the time, testFunc is actually run, _version will always be set to "V2" (useV1Command == false). So we need to capture this inside the lambda that's passed into super.test.
Also, note that we need to call setCommandVersion() before calling super.test because super.test being called is utilizing the commandVersion to set the right test name:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/CreateNamespaceSuite.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
Show resolved
Hide resolved
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
Outdated
Show resolved
Hide resolved
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/CreateNamespaceSuite.scala
Show resolved
Hide resolved
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
Outdated
Show resolved
Hide resolved
| "org.apache.hadoop.hive.metastore.api.AlreadyExistsException") | ||
| && exception.getMessage.contains( | ||
| s"Database ${dbDefinition.name} already exists")) { | ||
| Some(new DatabaseAlreadyExistsException(dbDefinition.name)) |
There was a problem hiding this comment.
Can we test it in the VersionsSuite, by updating the existing $version: createDatabase test? We need to make sure this wrap exception logic works for all hive versions.
There was a problem hiding this comment.
Updated the test. Thanks!
|
thanks, merging to master! |
### What changes were proposed in this pull request? This PR proposes to use V2 commands as default as outlined in [SPARK-36588](https://issues.apache.org/jira/browse/SPARK-36588), and this PR migrates `CREATE NAMESPACE` to use v2 command by default. Note that the work to tests covering both v1/v2 were done in apache#35093. ### Why are the changes needed? It's been a while since we introduced the v2 commands, and it seems reasonable to use v2 commands by default even for the session catalog, with a legacy config to fall back to the v1 commands. ### Does this PR introduce _any_ user-facing change? The error message will be slightly different if namespace already exists when v2 command is run against v1 catalog: v1 command: `Database 'db' already exists"` vs. v2 command: `Namespace 'db' already exists"` Also, the error message will be slightly different if namespace already exists when v1 command is run against Hive catalog: Before: `Database db already exists"` vs. After: `Database 'db' already exists"` ### How was this patch tested? Existing *CreateNamespaceSuite tests cover this PR's change. Closes apache#35113 from imback82/migrate_create_namespace. Authored-by: Terry Kim <yuminkim@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…e/drop database to HiveClientImpl ### What changes were proposed in this pull request? #35113 introduced `HiveExternalCatalog.withClientWrappingException` to to wrap a Hive exception to a Spark exception. However, there was a limitation of testing it against different Hive versions as outlined in #35173 (comment). This PR proposes to revert `HiveExternalCatalog.withClientWrappingException` and move wrapping logic to `HiveClientImpl`. ### Why are the changes needed? 1. To make testing against different Hive versions better. 2. For Hive 0.12, the wrapping logic was not working correctly for `dropDatabase` because the message was not matching. ### Does this PR introduce _any_ user-facing change? Yes, for Hive 0.12, the exception message will now be consistent with other versions. Before: `InvalidOperationException(message:Database db is not empty)` After: `Cannot drop a non-empty database: db. Use CASCADE option to drop a non-empty database` ### How was this patch tested? Added a new test coverage. Closes #35173 from imback82/drop_db_withClientWrappingException. Authored-by: Terry Kim <yuminkim@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…e/drop database to HiveClientImpl ### What changes were proposed in this pull request? apache#35113 introduced `HiveExternalCatalog.withClientWrappingException` to to wrap a Hive exception to a Spark exception. However, there was a limitation of testing it against different Hive versions as outlined in apache#35173 (comment). This PR proposes to revert `HiveExternalCatalog.withClientWrappingException` and move wrapping logic to `HiveClientImpl`. ### Why are the changes needed? 1. To make testing against different Hive versions better. 2. For Hive 0.12, the wrapping logic was not working correctly for `dropDatabase` because the message was not matching. ### Does this PR introduce _any_ user-facing change? Yes, for Hive 0.12, the exception message will now be consistent with other versions. Before: `InvalidOperationException(message:Database db is not empty)` After: `Cannot drop a non-empty database: db. Use CASCADE option to drop a non-empty database` ### How was this patch tested? Added a new test coverage. Closes apache#35173 from imback82/drop_db_withClientWrappingException. Authored-by: Terry Kim <yuminkim@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
This PR proposes to use V2 commands as default as outlined in SPARK-36588, and this PR migrates
CREATE NAMESPACEto use v2 command by default.Note that the work to tests covering both v1/v2 were done in #35093.
Why are the changes needed?
It's been a while since we introduced the v2 commands, and it seems reasonable to use v2 commands by default even for the session catalog, with a legacy config to fall back to the v1 commands.
Does this PR introduce any user-facing change?
The error message will be slightly different if namespace already exists when v2 command is run against v1 catalog:
v1 command:
Database 'db' already exists"vs.
v2 command:
Namespace 'db' already exists"Also, the error message will be slightly different if namespace already exists when v1 command is run against Hive catalog:
Before:
Database db already exists"vs.
After:
Database 'db' already exists"How was this patch tested?
Existing *CreateNamespaceSuite tests cover this PR's change.